Advanced Regular Expressions
https://github.com/ziishaned/learn-regex
Sample Problems
Problem 1
Which of the following strings match the regular expression pattern "^w{3}\.([a-z0-9]([-a-z0-9]{0,61}[a-z0-9])+\.)+[a-z0-9][-a-z0-9]{0,61}[a-z0-9]"?
- www.google.com
- www.-petsmart.com
- www.edu-.ro
- www.google.co.in
- www.examples.c.net
- www.edu.training.computer-science.org
- www.everglades_holidaypark.com
Solution: This regular expression matches a domain name used to access websites.
The pattern starts with the subdomain www and continues with a sequence of domain names separated by dots (top-level domain (TLD), second-level domain (SLD), third-level domain, and so on).
A domain name contains only lowercase letters, digits and hyphens. It cannot begin or end with a hyphen. Each name is at least 2 characters long, and the top-level domain is at most 63 characters long. Note that the + on the inner group lets the other names run longer than 63 characters, so the pattern itself does not enforce that limit for them.
^w{3} and the dot after it : the string starts with www, followed by a dot character;
[a-z0-9] : the first and the last character of a domain name must be a lowercase letter or a digit;
[-a-z0-9]{0,61} : the characters in between can be lowercase letters, digits or hyphens, at most 61 of them;
( ... )+ : the outer group, a name followed by a dot, repeats one or more times;
The last sequence [a-z0-9][-a-z0-9]{0,61}[a-z0-9] is for the top-level domain, which is not followed by a dot.
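The strings that match this pattern are 1, 4 and 6. String 2 has a name that begins with a hyphen, string 3 has a name that ends with a hyphen, string 5 contains the one-character name c, and string 7 contains an underscore.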
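The answer can be checked with a short Python sketch. It assumes that every dot in the pattern is meant as a literal dot (written \. here) and that a string counts as a match only when the whole string fits the pattern, which is why re.fullmatch is used. The names DOMAIN_RE, CANDIDATES and matching_numbers are illustrative and not part of the problem statement.

```python
import re

# Assumption: each "." in the pattern is a literal dot, so it is escaped as "\.".
DOMAIN_RE = re.compile(
    r"^w{3}\.([a-z0-9]([-a-z0-9]{0,61}[a-z0-9])+\.)+"
    r"[a-z0-9][-a-z0-9]{0,61}[a-z0-9]"
)

# The seven candidate strings, in the order listed in the problem.
CANDIDATES = [
    "www.google.com",
    "www.-petsmart.com",
    "www.edu-.ro",
    "www.google.co.in",
    "www.examples.c.net",
    "www.edu.training.computer-science.org",
    "www.everglades_holidaypark.com",
]


def matching_numbers(candidates):
    """Return the 1-based positions of the strings that match in full."""
    return [
        position
        for position, text in enumerate(candidates, start=1)
        if DOMAIN_RE.fullmatch(text)
    ]


print(matching_numbers(CANDIDATES))  # prints [1, 4, 6]
```

Using re.fullmatch instead of re.search matters here because the pattern has no end anchor: with a plain search, a string such as www.google.com followed by extra characters would still be accepted.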
Problem 2
Write a regular expression describing the set of strings formed according to the following rules:
- Contain only lowercase letters of the English alphabet and the character '.';
- Start and end with the same letter;
- Contain a sequence of at least one and at most three vowels, separated by zero or more '.' characters from a sequence of at least one consonant.
Solution:
The regular expression is:
([a-z])[aeiou]{1,3}\.*[b-df-hj-np-tv-z]+(\1)
([a-z]) is group number 1, which captures the first letter;
\1 is a backreference to group 1, so the string must end with the same letter it starts with;
[aeiou]{1,3} describes a sequence of one to three vowels; the vowels are listed without separators, because a comma inside the brackets would itself be matched;
\.* means the character '.' appears zero or more times; the dot is escaped, because an unescaped . matches any character;
[b-df-hj-np-tv-z]+ is a sequence of at least one consonant.
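A minimal Python sketch for trying the solution on a few strings. It assumes the vowel class is written [aeiou] and the dot is escaped as \.*, and it uses re.fullmatch because the pattern carries no ^ or $ anchors. The name is_valid and the sample strings are made up for illustration.

```python
import re

# Assumptions: vowels are written as [aeiou] and the dot is a literal "\.".
PATTERN = re.compile(r"([a-z])[aeiou]{1,3}\.*[b-df-hj-np-tv-z]+(\1)")


def is_valid(text):
    """Return True when the whole string follows the rules of Problem 2."""
    return PATTERN.fullmatch(text) is not None


samples = [
    "tea..mt",    # t, vowels "ea", two dots, consonant "m", closing t
    "xabx",       # zero dots between the vowel and the consonant
    "xa.by",      # last letter differs from the first
    "x.bx",       # no vowel after the first letter
    "xaeiou.bx",  # more than three vowels in a row
]

for text in samples:
    print(text, is_valid(text))
```

The first two samples are accepted and the last three are rejected. The backreference \1 is what ties the last letter to the first, so changing only the final character of an accepted string makes it fail.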